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The methodology of Markov basis initiated by Diaconis and Sturmfels^ stimu- 
lated active research on Markov bases for more than ten years. It also motivated 
improvements of algorithms for Grobner basis computation for toric ideals, such 
as those implemented in 4ti2.^ However at present explicit forms of Markov 
bases are known only for some relatively simple models, such as the decompos- 
able models of contingency tables. Furthermore general algorithms for Markov 
bases computation often fail to produce Markov bases even for moderate-sized 
models in a practical amount of time. Hence so far we could not perform ex- 
act tests based on Markov basis methodology for many important practical 
problems. 

In this article we propose to use lattice bases for performing exact tests, in 
the case where Markov bases are not known. Computation of lattice bases is 
much easier than that of Markov bases. With many examples we show that the 
approach with lattice bases is practical. We also check that its performance is 
comparable to Markov bases for the problems where Markov bases are known. 
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1. Introduction 

Since Diaconis and Sturmfels^ introduced a Markov basis and proposed an 
algorithm of exact test by sampling contingency tables sharing a sufficient 
statistic, the algebraic and statistical properties of Markov bases for toric 
models have been extensively studied. Once a Markov basis is given, we can 
perform an exact test by using the basis. There exist algebraic algorithms to 
compute a Markov basis and a Markov basis of models for relatively small 
contingency tables can be computed by a computer algebra system such as 
4ti2.^ However the computational cost of these algorithms is very high and 
at present it is difficult to compute a Markov basis for even moderate-sized 
models by such softwares in a practical amount of time. 

For important models for applications we can investigate the structure 
of Markov bases for the model. In general, however, the structure is compli- 
cated and explicit forms Markov bases are known only for a few models such 
as the decomposable model,^ no-three-factor interaction model for relatively 
small tables.* Considering the fact that an exact test is needed especially 
when the sample size is relatively small for the degrees of freedom of the 
model and the chi-square approximation of a test statistic is not accurate, 
these results at this point are not satisfactory from a practical viewpoint. 

The set of c;ontingency tables sharing a sufficient statistic is called a 
fiber. Markov basis is defincid as a set of moves connecting every fiber. 
One reason for the complexity of Markov bases is that they guarantee the 
connectivity of every fiber. In practice, we only need to connect a fiber 
which a given data set belongs to. Sometimes we can find a useful subset of a 
Markov basis which has a simple structure and guarantees the connectivity 
of particular fibers. ^"^ However, again, such a subset is not easy to obtain 
in general.® 

In view of these difficulties with Markov bases, for performing exact 
tests we propose to use a lattice basis, which is a basis of the integer kernel 
of a configuration matrix, instead of a Markov basis. Computation of lattice 
bases is much easier than computation of Markov basis. With many exam- 
ples we show that the proposed approach is practical. Note that a lattice 
basis itself does not guarantee the connectivity of every fiber. However ev- 
ery move is written as an integer combination of elements of a lattice basis. 
Hence, if we generate moves in such a way that every integer combination 
of elements of a lattice basis has a positive probability, then we can indeed 
guarantee the connectivity of every fiber. 

When we run a Markov chain over a fiber, the transition probabili- 
ties can be easily adjusted by the standard Metropolis-Hastings procedure. 
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Hence we can use any probability distribution for generating the moves, 
as long as every integer combination of elements of a lattice basis has a 
positive probability. 

Based on the above observations, in this paper we discuss sampling of 
contingency tables by using a lattice basis. Wc propose simple algorithms 
for generating moves such that every move is generated with a positive 
probability by using a lattice basis. We can apply the proposed method 
to models whose Markov basis is not easy to compute and we show the 
usefulness of the proposed method through numerical experiments. 

The organization of the this paper is as follows. In Section 2 we give a 
brief review on a Markov basis and lattice basis. In Section 3 we propose 
algorithms for generating moves by using lattice basis and in Section 4 
we show the practicality and usefulness of the proposed method through 
numerical experiments. 

2. Markov basis and lattice basis 

In this section wc give a brief review on a Markov basis and a lattice basis. 
Let X = G I] denote a contingency table, where x{i) is a cell 

frequency for a cell i and I is the set of cells. When we order the elements 
of X appropriately, x is considered as an |I| dimensional column vector. Let 
t denote the vector of the sufficient statistic for a toric model. In a toric 
model there exists an integer matrix A satisfying 

Ax = t. 

A is called a configuration matrix associated with the model. The set of 
contingency tables sharing t is called a fiber and denoted by J^t- 

Consider a goodness-of-fit test for the model. When t is fixed, x is 
distributed exactly as a hypergcomctric distribution over the fiber J-t- If 
we can enumerate the elements of the fiber, it is possible to evaluate a 
test statistic based on the exact hypergeometric distribution. In general 
the enumeration is infeasible and the evaluation of the distribution of a 
test statistic is done by sampling contingency tables. 

Let 

kerz A = ker A n z'^l = {z e z'^l \ Az = Q} 

denote the integer kernel of A. An element of kerz A is called a move for the 
model. By adding or subtracting a move z = {z{i),i G 1} G ker^ A, a con- 
tingency table X is transformed to a table in the same fiber y = x+z, as long 
as y does not contain a negative cell. A finite set of moves B = {zi, . . . , zm} 
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is called a Markov basis if for every fiber all the states become mutually 
accessible by moves in B. Consider an undirected graph Gt.B whose vertices 
are the elements of a fiber We draw an edge between x & J^t and y & J-t 
if there exists z G B such that y = x + zovy = x — z. B = {zi, . . . , zm} 
is a Markov basis if and only if Gt.B is ("onncicted for all t. In this way 
a Markov basis guarantees the connectivity of every fiber. Combined with 
the standard Metropolis-Hastings procedure, the connectivity enables us to 
sample contingency tables from an irreducible Markov chain whose station- 
ary distribution is the hypergeometric distribution by Markov chain Monte 
Carlo (MCMC) method. Therefore once a Markov basis is obtained, we can 
evaluate the distribution of a test statistic of a conditional test based on 
the exact distribution. 

A move z is written as a difference of its positive part and negative part 
as z = z+ — z~, where z'^{i) = max(z(i),0) and = max(— 0), 

i & I. Consider a binomial p'' — p" corresponding to z, where = 
riiei-PC^)^ and p{i) arc indctcrminates. The degree of the binomial 
p^ — p^ is called the degree of z. Let I a be the toric ideal associated 
with a configuration A. Then p^* — p^ € Ia if and only if z = z+ — z~ 
is a move. Algebraically a Markov basis is defined as a generator of the 
toric ideal I a- A Grobner basis of I a forms a Markov basis. ^ A Markov 
basis or a Grobner basis of models for relatively small contingency tables 
can be computed by a computer algebra system such as 4ti2.^ However the 
computational cost is very high and for even moderate-sized models it is 
difiicult to compute a Markov basis or Grobner basis in a practical amount 
of time. 

Let d = dimkerA = |I| — rank A be the dimension of linear space 

spanned by the elements of kci A in M'-^'. It is a standard fact that the 
integer lattice kerz A possesses a lattice basis C = {zi, . . . , z^}, such that 
every z € kerz A is a unique integer combination of Zi, . . . , z^.^ Given A, 
it is relatively easy to compute such a basis of kerz A using the Hermite 
normal form of A. 

Usually a lattice basis contains exactly d elements. In this paper we 
allow redundancy of a lattice basis and call a finite set jC of moves a lattice 
basis if every move is written by an integral combination of the elements 
of £. As we mentioned it is relatively easy to compute a lattice basis for a 
given A. Also, for many statistical models, where a Markov basis is hard 
to obtain, we can more easily identify a lattice basis. An example of this is 
the Lawrence lifting discussed in Section 3.2. 

Let S" be a polynomial ring and let Ic = {p^ | z G £) be the ideal 
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generated by a lattice basis £. The toric ideal Ia is obtained from Ic by 
taking saturation^^'^^ 



Intuitively this fact shows that when the frequency of each cell is sufficiently 
large, the fiber is connected by the lattice basis C 

3. Sampling contingency tables with a lattice basis 

111 this section we propose algorithms to generate a move based on a lattice 
bases. We also give lattice bases for higher Lawrence configurations. 

3.1. Generating moves by using a lattice basis 

Assume that C = {zi, . . . ,Zk}, K > d, is a lattice basis. Then any move 
z e kerz A is expressed as 



Then we can generate a move z by generating the integer coefficients 
ai, . . . , ax- In the numerical cxperiiiiciiits in the next section we use the 
following two methods to generate ai, . . . , ax- Both methods generate all 
integer combinations of elements of £ with positive probabilities and hence 
guarantee the connectivity of all fibers. 

Algorithm 3.1. 

Step 1 Generate |ai|, . . . , lax] from Poisson distribution with mean X, 



and exclude the case \ai\ = • • • = \aK\ = 0. 
Step 2 afe \ak\ or ak < \cxk\ with probability 1/2 for k = 1,. . . ,K. 

Algorithm 3.2. 

Step 1 Generate \a\ = fi^om geometric distribution with param- 




z = aiZi-\ \-aKZK, ai, 



. . . , ax e 



\ak\'^PoiX) 



eter p 



|a| ~ Geom{p) 
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and allocate \a\ toai,. . . , ax according to multinomial distribution 
ai, . . . , ~ Mn{\a\; 1/K, l/K) 
Step 2 Q!/s ■<— \ak\ or ak < |Q!fe| with probability 1/2 for k = 1, . . . , K . 



3.2. A lattice basis for higher Lawrence configuration 

Consider a configuration matrix of the form 



A(^) 



A 
/ I 



where I is an identity matrix. A{A) is called the Lawrence lifting of ^ or a 
Lawrence configuration.^^ More generally the r-th Lawrence configuration 
is defined by 



r-l 



(A 

A 



aW(A) 



0\ 




... 



A 



(1) 



Many practical statistical models including the no-three-factor interaction 
model and the discrete logistic regression model discussed in the following 
section have Lawrence configurations. In general a Markov basis for the 
Lawrence configuration is very difficult to compute.^''' On the other hand 
it is easy to compute a lattice basis and the proposed method is available 
even for such models. We can compute a lattice basis of bS'^^i^A) by the 
following propositions. 

Proposition 3.1. Let the column vectors of a matrix B form a lattice basis 
of A. Then the column vectors of ^ form a lattice basis of A{A). 

Proof. Let x and y be two contingency tables in the same fiber for A{A). 
Let \T\ = 2n be the number of cells. Then we note that n is the number 
of columns of A. Write x = {x'i,x'2)' , where x\ and X2 are n x 1 column 
vectors and ' denotes the transpose. In the same way, write y = (2/1,^2)'- 
Let 



X 



y 



Xi 
X2 



yi 
y2 
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be a move of A{A). Since Azi = 0, zi is written by an integer linear 
combination of B as Zi = Bex, where cc is an Z x 1 integer vector. zq+Zi = 
implies that Zi = —Bex and therefore 



z = 



B 

-B 



B 



Hence ( ^ ) form a lattice basis of A(^). 



□ 



Proposition 3.2. Let the column vectors of B form a lattice basis of A. 
Then the column vectors of 



( B 
B 



\ 



: ■■• ■•. 
■■■ B 
\-B -B bJ 



(2) 



form a lattice basis of higher Lawrence configuration A^'^^ {A) . 



Proof. We can interpret the r-th Lawrence lifting as r slices of the original 
contingency table corresponding to A. The number of the cells for A^""-* (A) 
is |I| = rn, where n is the number of cells (columns) of A. Let 







- yi\ 












-VrJ 



be a move of A^^\A). We can express zi = Bai. Then using the r-th slice 
as "pivots" we can write 









\-B/ 



I 







\ 



Zr-l 

\zr + Bail 



Note that the first block of z is now eliminated. Performing the same op- 
eration recursively to other blocks we are left with the (r — l)-th slice and 
r-th slice, which is the same as the previous proposition. □ 
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In this proposition wc only used the last slice as pivots. More symmetric 
lattice basis can be obtained by columns of all pairwise differences of slices, 
for example for r = 3 

B B \ 
-BO B \ . 
-B-Bj 

The lattice bases in the above propositions may contain redundant el- 
ements. However the set of moves including redundant elements arc some- 
times preferable for moving around the fiber. In general the computation 
of a lattice basis of A is easier than the computation of a lattice basis of 
A^'^\A). Sometimes we can compute a Markov basis for A even when it is 
difficult to compute a Markov basis of A^'^'>{A). If a Markov basis for A is 
known, we can use it as a lattice basis for A and apply the above propo- 
sitions for obtaining a lattice basis of A('")(A). In the following numerical 
experiments we compute a lattice basis by using the above propositions. 

4. Numerical experiments 

In this section wc apply the proposed method to the no-three-factor in- 
teraction model and the discrete logistic regression model and show the 
usefulness of the proposed method. 

4.1. No-three-factor interaction model 

No-three-factor interaction model is a model for three-way contingency ta- 
bles. Let Xi-^i^i^ and Pi^i2i3 denote a cell frequency and a cell probability 
of a cell i = i2, is) of a three-way contingency table, respectively. Then 
the model is described as 

logftiiais = Ml2(«l«2) + M23(«2i3) + ^J'Zl{^3h), 

where Hi2, /X23 and /i3i are free parameters. Aoki and Takcmura^ discussed 
the structure of Markov basis for 3 x 3 x table in detail and showed that 
there exists a Markov basis such that the largest degree of moves is 10. In 
general, however, the structure of Markov bases for this model is known 
to be complicated and the closed form expression of Markov bases for this 
model of general tables is not yet obtained at present. Even by using 4ti2, 
it is difficult to compute a Markov basis for contingency tables larger than 
5x5x5 tables within a practical amount of time. 
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This model has the higher Lawrence configuration in (1) such that A is 
a configuration for the two-way complete independence model. The set of 
basic moves of form 





ii 


^[ 




1 


-1 


i'2 


-1 


1 



is known to be a Markov basis for the two-way complete independence 
model. By using this fact and Proposition 3.2, we can compute a lattice 
basis as a set of degree four moves, 



«3 





h 


^'l 




1 


-1 


^'2 


-1 


1 





h 




«2 


-1 


1 


^'2 


1 


-1 



In this experiment we compute an exact distribution of the log-likelihood 
ratio (LR) statistic of the goodness-of-fit test for no-three-factor interaction 
model against the three-way saturated model 

logPiiisis = Ml23(il«2«3)- 

We computed sampling distribution of the LR statistic for 7 x 7 x /, 7 = 

3, 5, 10 three-way contingency tables. Then the degrees of freedom of the 
asymptotic distribution of LR statistic is (7—1)'^. We set the sample size 
as 57^. For 3x3x3 tables, the number of burn- in samples and iterations are 
(burn-in, iteration) = (1000, 10000). In 3 x 3 x 3 tables, a minimal Markov 
basis is known"* and we also compute a sampling distribution by a Markov 
basis. In other cases, we set (burn-in, iteration) = (10000, 100000). 

Figure 1 presents the results for 3 x 3 x 3 tables. Left, center and right 
figures are histograms, paths and correlograms of the LR statistic, respec- 
tively. Solid lines in the left figures are asymptotic distributions with 
degrees of freedom 8. ak is generated from Po(A), A = 1, 10, 50. 

We can see from the figures that the proposed methods show compara- 
tive performance to the sampling with a Markov basis. Although the sam- 
pling distribution and the path is somewhat unstable for A = 50, in other 
cases the sampling distributions are similar and the paths are stable after 
burn-in period. Unless we set A as extremely high, the proposed method is 
robust against the distribution of . 

Figure 2 presents the results for 5 x 5 x 5 and 10 x 10 x 10 tables. In 
these cases Markov basis cannot be computed via 4ti2 within a practical 
amount of time by an Intel Core 2 Duo 3.0 GHz CPU machine. So we 
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compute sampling distributions by the proposed method. For 5x5x5 
tables, ai, . . . , are generated from Geom{p)^ p ~ 0.1, 0.5. The degrees 
of freedom of the asymptotic distribution is 64. Also in this case we 
can see that the proposed methods perform well. The approximation of the 
sampling distributions to the asymptotic distribution is good and the 
paths are stable after burn-in period. 

For 10 X 10 X 10 tables, ai, . . . , ax are generated from Po(A), A = 10, 50. 
The degrees of freedom of the asymptotic distribution is 729. In this 
case the performances of the proposed methods look less stable. We also 
compute the cases where the sample sizes are 10/"^ and 100/"^ but the results 
are similar. This is considered to be because the size of fibers of 10 x 10 x 10 
tables is far larger than those of 3x3x3 or 5x5x5 tables and it is more 
difficult to move around all over a fiber. Even if we use a Markov basis, the 
result might not be improved. Increasing the number of iterations might 
lead to a better performance. 

Comparing the paths with A = 10, the path with A = 50 looks relatively 
more stable. For larger tables, larger A might be preferable to move around 
a fiber. 



4.2. Discrete logistic regression model 

The logistic regression model with discrete covariatcs is considered as a 
model for contingency tables. The model is defined by the conditional prob- 
ability for the response variable. The model with one covariate and the 
model with two covariates are described as 

exp(^ij + ai^i2) 



Pil\i2 = S 



1 



. l + E!;=i^exp(/i + Q;i'^i2)' 
where i2 S X2 and 



ii = 1, . . . ,7i - 1, 

ii = h, 



_ I 1 + Ei;=i exp(Mii + ai'j2 + ft; is) ' 



. 1 + Y.i[=i cxp(Aii'i + a,'^i2 + ft;«3) 



h = h, 



1, 



where (i2,«3) 6 X2 x^s, respectively. Pi^ii^ andpjj^ij^ig arc conditional prob- 
abilities that the value of the response variable equals ii given the covari- 
atcs i2 and (^27*3), respectively. X2 and I2 x I3 are designs for covariates. 
The structure of Markov bases for discrete logistic regression model is also 
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known to be complicated even for the case of binary responses Ii = 2.^'"^ 
Chen ct al.^ and Hara ct al7 discussed the model with one covariatc which 
is discrete and equally spaced and showed that the set of degree four moves 
of form 





12 12 + k i'2 


-k 




ii 


1 -1 


-1 


1 




-1 1 


1 


-1 



connects all fibers. Hara ct al^ generalized the argument to the model with 
two covariates both of which are equally spaced. However it seems to be 
difficult to generalize these arguments to the models with more than two 
covariates or with more than two responses /i > 2 at this point. A Markov 
basis connecting all designs has to contain higher degree moves and the 
number of moves in a Markov basis is very large. Table 1 presents the 
highest degrees and the numbers of moves in the minimal Markov bases of 
binomial logistic regression models with one covariatc computed by 4ti2. 
Even for models with one covariatc, if a covariatc has more than 20 levels, 
it is difficult to compute Markov bases of models via 4ti2 within a practical 
amount of time by a computer with a 32-bit processor. 

The logistic regression model with r responses has the r-th Lawrence 
configuration (1) where A is a configuration for Poisson regression model. 
The computation of Markov bases of Poisson regression model is relatively 
easy. Therefore a lattice basis can be computed by Proposition 3.2 and we 
can apply the proposed method to these models. 



Table 1. The highest degrees and the number of moves in a minimal Markov 
basis for binomial logistic regression models with one covariate 





10 


11 


number of levels of a covariate 
12 13 14 15 


16 


maximum degree 
number of moves 


18 
1830 


20 
3916 


22 24 26 28 
8569 16968 34355 66066 


30 
123330 



In the experiment we considered the goodness-of-fit test of binomial or 
trinomial logistic regression model with two covariates against a model with 
three covariates 

I _^ , li = 1, . . . , Ji — i, 

■ ■ — — ' ^1=^1 
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where (^2,^3) e X2 x I3, 14 e I4. We use the LR statistic as a test statistic. 
Wc assume that I2 x X3 are 4x4 and 10 x 10 checkered designs as described 
in the following figure for the 4x4 case, where only (12, 13) in dotted patterns 
have positive frequencies. 



We also assume that I4 = {1,2,3,4,5}. The degrees of freedom of the 
asymptotic distribution of the LR statistic is 1. We set the sample sizes 
for 4 X 4 and 10 x 10 designs are 200 and 625, respectively. We also set 
(burn-in, iteration) = (1000,10000). 

Figures 3 and 4 present the results for a binomial and a trinomial logistic 
regression models with 4x4 checkered pattern, respectively. Solid lines in 
the left figures are asymptotic distributions, is generated from Po(A), 
A = 1, 10,50. We can compute Markov bases in these models. So we also 
present the results for Markov bases. We can see from the figures that the 
proposed methods show comparative performance to a Markov basis also 
in these models. We note that the paths are also stable even for the case 
where ai, . . . , are generated from Po(50). 

Figure 5 presents the results for 10 x 10 checkered pattern. In this case 
Markov bases cannot be computed via 4ti2 by our machine, ak is generated 
from Geom{p), A = 0.1, 0.5. Also in these cases the results look stable. These 
results shows that the proposed method is useful for the logistic regression 
models for which that it is difficult to compute a Markov basis. 
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(b) a lattice basis with -Po(l) 
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(c) a lattice basis with Po(lO) 
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(d) a lattice basis with Po(50) 



Fig. 1. Histograms, paths of LR statistic and correlograms for 3x3x3 no-three-factor 
interaction model {(burn in, iteration) = (1000, 10000)) 
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(a) 5 X 5 X 5, a lattice basis with Geom(O.l) 
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(b) 5 X 5 X 5, a lattice basis with Geom(0.5) 
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(c) 10 X 10 X 10, a lattice basis with Po(lO) 
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(d) 10 X 10 X 10, a lattice basis with Po(50) 

Fig. 2. Histograms, paths of LR statistic and correlograms of paths for no-three-factor 
interaction model ({burn in, iteration) = (10000, 100000)) 
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(b) a lattice basis with -Po(l) 
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Fig. 3. Histograms, paths of LR statistic and correlograms of paths for discrete logistic 
regression model ((burn in, iteration) = (1000, 10000)) 
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(c) a lattice basis with Po(lO) 
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(d) a lattice basis with Po(50) 

Fig. 4. Histograms, paths of LR statistic and correlograms of paths for trinomial dis- 
crete logit model ((burn in, iteration) = (1000, 10000)) 
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(b) binomial, a lattice basis with Geom(0.5) 




(c) trinomial, a lattice basis with Geom(O.l) 




(d) trinomial, a lattice basis with 6*60771(0.1) 



Fig. 5. Histograms, paths of LR statistic and corrclograms of paths for discrete logistic 
regression model ((burn in, iteration) = (1000, 10000)) 



